(as amended September 7, 2000)

1. (Twice Amended) A method of determining a consensus profile for perturbations 
to a cell type or organism, said method comprising identifying common response motifs 
among sets of cellular constituents in a plurality of response profiles, each response profile in 
said plurality of response profiles (i) comprising measurements of a plurality of cellular 
constituents, and (ii) resulting from a different perturbation to said type of cell or organism, 
wherein each of said sets of cellular constituents consists of cellular constituents that co-vary 
under a plurality of perturbations or that are co-regulated, and wherein said common response 
motifs constitute the consensus profile for said perturbations. 

2. The method of claim 1, wherein the plurality of response profiles comprises at least 
five response profiles.

3. The method of claim 2, wherein the plurality of response profiles comprises more 
than ten response profiles. 

4. The method of claim 3, wherein the plurality of response profiles comprises more 
than 50 response profiles. 

5. The method of claim 4, wherein the plurality of response profiles comprises more
than 100 response profiles. 

6. The method of claim 1, wherein the perturbations are associated with a particular 
biological effect. 

7. The method of claim 6, wherein the particular biological effect is the effect of a 
particular class or type of drug.

8. The method of claim 6, wherein the particular biological effect is a therapeutic
effect.

9. The method of claim 6, wherein the particular biological effect is a toxic effect. 

10. (Twice Amended) The method of claim 1, wherein each of the sets of cellular
constituents consists of cellular constituents which are co-regulated.

11. (Twice Amended) The method of claim 1, wherein each of the sets of cellular
constituents consists of cellular constituents which co-vary in the plurality of response 
profiles. 

12. The method of claim 11, wherein the cellular constituents which co-vary are
identified by cluster analysis of cellular constituents in the plurality of response profiles. 

13. The method of claim 12, wherein the cluster analysis is done by means of a 
clustering algorithm. 

14. The method of claim 13, wherein the clustering algorithm is hclust. 

15. The method of claim 12, wherein said cluster analysis determines a clustering tree, 
the cellular constituents which co-vary comprising branches of said clustering tree. 

16. The method of claim 15, wherein the sets of co-varying cellular constituents are
selected from a branching level of the clustering tree. 
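
Editorial illustration for claims 12 to 16: the sketch below shows one way co-varying cellular constituents might be grouped by building a clustering tree and stopping at a chosen branching level. It does not call hclust. It is a plain average-linkage procedure written for illustration, and the correlation-based distance, the function names, and the gene names and values are all assumptions rather than anything stated in the claims.

```python
from itertools import combinations
from math import sqrt

def correlation_distance(x, y):
    """Return 1 - Pearson correlation between two equal-length response vectors.

    Assumes neither vector is constant (a constant vector has zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def cluster_constituents(responses, n_sets):
    """Average-linkage agglomerative clustering, stopped at n_sets clusters.

    responses maps a constituent name to its responses across perturbations.
    Stopping at n_sets clusters corresponds to cutting the tree at one level.
    """
    clusters = [[name] for name in responses]
    while len(clusters) > n_sets:
        def linkage(pair):
            a, b = pair
            dists = [correlation_distance(responses[p], responses[q])
                     for p in clusters[a] for q in clusters[b]]
            return sum(dists) / len(dists)
        # Merge the two clusters with the smallest average pairwise distance.
        a, b = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical log-ratio responses of four genes under five perturbations.
responses = {
    "geneA": [1.0, 2.0, 0.1, -1.0, 0.5],
    "geneB": [1.1, 1.9, 0.0, -0.9, 0.6],
    "geneC": [-1.0, 0.2, 2.0, 1.0, -0.5],
    "geneD": [-0.9, 0.1, 2.1, 1.1, -0.4],
}
sets = cluster_constituents(responses, n_sets=2)
print(sets)
```

With these made-up values, geneA and geneB fall into one set and geneC and geneD into the other. Choosing n_sets plays the role of selecting a branching level.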

17. The method of claim 12, wherein a statistical significance for the sets of co- 
varying cellular constituents is determined by means of an objective statistical test. 

18. The method of claim 17, wherein the objective statistical test comprises:

(a) determining an actual fractional improvement in cluster analysis of the cellular
constituents;

(b) generating permuted response of cellular constituents by means of Monte
Carlo randomization of perturbation index for the response of each cellular 
constituent across all perturbations; 

(c) performing cluster analysis on the permuted response of cellular constituents; 

(d) determining the fractional improvement in the cluster analysis on the permuted
response of cellular constituents; and 

(e) repeating said steps of generating permuted response of cellular constituents
and performing cluster analysis on the permuted response of cellular 
constituents so that a distribution of fractional improvements is obtained; 

wherein the statistical significance is determined by comparing the actual fractional 
improvement to the distribution of fractional improvements. 
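
Editorial illustration for claim 18: steps (a) to (e) can be sketched as a Monte Carlo permutation test. The claims do not define "fractional improvement", so the statistic used here (the share of total scatter removed by the best two-way split) is an assumption, and an exhaustive two-way split stands in for the cluster analysis, which is feasible only for small examples. All function names and data values are hypothetical.

```python
import random
from itertools import combinations

def scatter(rows):
    """Sum of squared distances of the rows from their mean vector."""
    center = [sum(col) / len(rows) for col in zip(*rows)]
    return sum((v - c) ** 2 for row in rows for v, c in zip(row, center))

def fractional_improvement(rows):
    """ASSUMED statistic: share of total scatter removed by the best two-way split.

    Needs at least two rows that are not all identical.
    """
    total = scatter(rows)
    best = total
    indices = range(len(rows))
    for size in range(1, len(rows) // 2 + 1):
        for left in combinations(indices, size):
            right = [i for i in indices if i not in left]
            split = (scatter([rows[i] for i in left])
                     + scatter([rows[i] for i in right]))
            best = min(best, split)
    return 1.0 - best / total

def permutation_test(rows, n_permutations=200, seed=0):
    """Return the actual statistic and its empirical p-value."""
    rng = random.Random(seed)
    actual = fractional_improvement(rows)                 # step (a)
    null = []
    for _ in range(n_permutations):                       # step (e)
        # Step (b): shuffle each constituent's responses across perturbations.
        permuted = [rng.sample(row, len(row)) for row in rows]
        null.append(fractional_improvement(permuted))     # steps (c) and (d)
    exceed = sum(1 for f in null if f >= actual)
    return actual, (exceed + 1) / (n_permutations + 1)

# Hypothetical responses: one row per constituent, one column per perturbation.
rows = [
    [2.0, 0.0, -2.0, 0.0, 2.0, 0.0],
    [2.1, 0.0, -2.0, 0.1, 2.0, 0.0],
    [1.9, 0.1, -2.1, 0.0, 2.0, -0.1],
    [0.0, 2.0, 0.0, -2.0, 0.0, 2.0],
    [0.1, 2.0, 0.0, -2.1, 0.0, 1.9],
    [0.0, 2.1, -0.1, -2.0, 0.1, 2.0],
]
actual, p_value = permutation_test(rows)
print(actual, p_value)
```

The p-value is the fraction of permuted data sets whose statistic is at least as large as the actual one, with one added to numerator and denominator so that it is never exactly zero.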

19. The method of claim 1, wherein the common response motifs are identified by re- 
ordering the response profiles into sets associated with similar biological effects. 

20. The method of claim 19, wherein the sets of response profiles associated with 
similar biological effects are identified by cluster analysis of the response profiles.

21. The method of claim 20, wherein the cluster analysis is done by means of a
clustering algorithm. 

22. The method of claim 21, wherein the clustering algorithm is hclust.

23. The method of claim 20, wherein said cluster analysis determines a clustering tree, 
the response profiles associated with similar biological effects comprising branches of said 
clustering tree. 

24. The method of claim 23, wherein the branches are selected by applying a cutting 
level across said clustering tree, said cutting level being determined by an expected number
of biological pathways represented by the sets of cellular constituents. 

25. The method of claim 20, wherein a statistical significance for the sets of response 
profiles is determined by means of an objective statistical test.

26. The method of claim 25, wherein the objective statistical test comprises:

(a) determining an actual fractional improvement in the cluster analysis of the
response profiles; 

(b) generating permuted response profiles by means of Monte Carlo 
randomization of cellular constituent index for each response profile across the 
measured cellular constituents; 

(c) performing cluster analysis on the permuted response profiles; 

(d) determining the fractional improvement in the cluster analysis on the permuted 
response profiles; and 

(e) repeating said steps of generating permuted response profiles and performing 
cluster analysis on the permuted response profiles so that a distribution of 
fractional improvements is obtained; 

wherein the statistical significance is determined by comparing the actual fractional 
improvement to the distribution of fractional improvements. 
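
Editorial illustration for claim 26, step (b): when response profiles rather than cellular constituents are being clustered, the randomization runs along the other axis, so each profile is shuffled across its constituents. A minimal sketch of that one step follows, with a hypothetical function name and made-up values. The surrounding test loop is assumed to be the same as for any permutation test.

```python
import random

def permute_constituent_index(profiles, rng):
    """Shuffle each response profile independently across its constituents.

    Each profile keeps the same set of measured values, but the values are
    no longer tied to the constituents they were measured on.
    """
    return [rng.sample(profile, len(profile)) for profile in profiles]

# Hypothetical data: one row per response profile, one column per constituent.
profiles = [
    [1.0, 0.9, -0.1, 0.0],
    [1.1, 1.0, 0.1, -0.1],
    [-0.1, 0.0, 1.0, 1.2],
]
rng = random.Random(0)
permuted = permute_constituent_index(profiles, rng)
print(permuted)
```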

27. The method of claim 1, wherein the sets of cellular constituents are basis cellular
constituent sets. 

28. The method of claim 27, wherein the basis cellular constituent sets are genesets.

29. A method of determining a consensus profile for perturbations to a cell type or 
organism, said method comprising identifying common response motifs among a plurality of 
projected profiles, each projected profile in said plurality of projected profiles 

(i) resulting from a different perturbation to said type of cell or organism, and 

(ii) comprising measurements of a plurality of cellular constituents in said type of cell
or organism that have been projected onto basis cellular constituent sets, said basis cellular
constituent sets being defined by co-variation of measurements of cellular constituents under
a plurality of different perturbations, wherein said common response motifs constitute the
consensus profile for said perturbations. 

30. The method of claim 1, wherein the consensus profile is the intersection of the sets
of cellular constituents activated or de-activated in the common response motifs. 



31. The method of claim 29, wherein the consensus profile is the intersection of the 
sets of cellular constituents activated or de-activated on the common response motifs. 

32. The method of claim 30 or 31, wherein the common response motifs are identified 
by re-ordering the response profiles into sets associated with similar biological effects.

33. The method of claim 31, wherein the intersection is identified by visual inspection 
of the plurality of projected response profiles.

34. The method of claim 32, wherein the intersection is identified by visual inspection 
of the plurality of projected response profiles. 

35. The method of claim 31, wherein the intersection is identified by thresholding the 
projected response profiles. 

36. The method of claim 31, wherein the intersection is identified arithmetically. 

37. The method of claim 36, wherein the intersection is identified by a method 
comprising: 

(a) replacing amplitudes of cellular constituent sets in the projected response
profiles that are above a threshold with values of unity;

(b) replacing amplitudes of cellular constituent sets in the projected response
profiles that are below said threshold with values of zero; and

(c) determining the element-wise product of the projected response profiles,
wherein the element-wise product of the projected response profiles is the intersection.
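
Editorial illustration for claim 37: steps (a) to (c) amount to binarizing each projected profile and multiplying the results position by position. The sketch below assumes that an amplitude exactly equal to the threshold is treated as below it, and that amplitudes are used as given (no absolute value is taken). Neither point is settled by the claim language. The function names, threshold, and values are hypothetical.

```python
def binarize(profile, threshold):
    """Steps (a) and (b): amplitudes above the threshold become 1, all others 0."""
    return [1 if amplitude > threshold else 0 for amplitude in profile]

def intersection(projected_profiles, threshold):
    """Step (c): element-wise product of the binarized projected profiles."""
    result = [1] * len(projected_profiles[0])
    for profile in projected_profiles:
        result = [r * b for r, b in zip(result, binarize(profile, threshold))]
    return result

# Hypothetical amplitudes of four basis sets in three projected profiles.
projected = [
    [0.9, 0.1, 0.7, 0.0],
    [0.8, 0.6, 0.5, 0.1],
    [0.7, 0.0, 0.9, 0.2],
]
consensus = intersection(projected, threshold=0.4)
print(consensus)  # [1, 0, 1, 0]
```

Only the first and third basis sets exceed the threshold in every profile, so only they survive the product.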

38. (Twice Amended) A method of determining a consensus profile for perturbations 
to a cell type or organism, said method comprising identifying common response motifs 
among sets of genes in a plurality of response profiles, each response profile in said plurality 
of response profiles (i) comprising measurements of transcript levels for a plurality of genes,
and (ii) resulting from a different perturbation to said type of cell or organism, wherein each
of said sets of genes consists of genes that co-vary under a plurality of perturbations or that
are co-regulated, and wherein said common response motifs constitute the consensus profile
for said perturbations. 

39. A method for comparing a biological response profile to a consensus profile, said 
consensus profile comprising common response motifs among a plurality of projected 
response profiles, each projected response profile in said plurality of projected response
profiles 

(i) resulting from a different perturbation to said type of cell or organism, and 

(ii) comprising measurements of a plurality of cellular constituents in said type of cell 
or organism that have been projected onto basis cellular constituent sets, said basis cellular 
constituent sets being defined by co-variation of measurements of cellular constituents under 
a plurality of different perturbations, wherein said common response motifs constitute the
consensus profile for said perturbations, said method comprising: 

(a) converting the biological response profile into a projected response profile by 
projecting measurements of cellular constituents in said biological response 
profile onto said basis cellular constituent sets; and 

(b) determining the value of a similarity metric between the projected 
response profile and the consensus profile. 

40. The method of claim 39, wherein said step of converting comprises projecting
the biological response profile onto the basis cellular constituent sets.

41. The method of claim 39, wherein the similarity metric is the generalized cosine 
angle between the projected response profile and the consensus profile. 
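
Editorial illustration for claims 39 to 41: the sketch below projects a measured response profile onto basis cellular constituent sets and compares the result with a consensus profile. The claims do not say how a projection is computed, so averaging the member constituents of each basis set is an assumption, and the "generalized cosine angle" is read here as the ordinary normalized dot product. The names and values are hypothetical.

```python
from math import sqrt

def project(profile, basis_sets):
    """Amplitude of each basis set, ASSUMED to be the mean response of its members."""
    return [sum(profile[name] for name in members) / len(members)
            for members in basis_sets]

def cosine_similarity(p, c):
    """Normalized dot product of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(p, c))
    return dot / (sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in c)))

# Hypothetical basis sets and a hypothetical measured profile.
basis_sets = [["geneA", "geneB"], ["geneC", "geneD"], ["geneE"]]
profile = {"geneA": 2.0, "geneB": 4.0, "geneC": 0.0, "geneD": 0.0, "geneE": 4.0}
consensus = [1.0, 0.0, 0.0]

projected = project(profile, basis_sets)               # step (a)
similarity = cosine_similarity(projected, consensus)   # step (b)
print(projected, similarity)
```

Here the projected profile is [3.0, 0.0, 4.0] and its similarity to the consensus is 3/5 = 0.6.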

42. The method of claim 39, further comprising a step of determining the statistical 
significance of the similarity metric. 

43. The method of claim 42, wherein the statistical significance is assessed using an 
empirical probability distribution generated under a null hypothesis of no correlation.
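
Editorial illustration for claims 42 and 43: one way to build an empirical distribution under a null hypothesis of no correlation is to shuffle the projected profile many times and recompute the similarity each time. The claims do not specify how the null distribution is generated, so the shuffling scheme, the one-sided comparison, and all names and values below are assumptions.

```python
import random
from math import sqrt

def dot(p, c):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(p, c))

def empirical_p_value(projected, consensus, n_draws=1000, seed=0):
    """One-sided empirical p-value for the cosine similarity under shuffling."""
    rng = random.Random(seed)
    # Shuffling leaves both vector norms unchanged, so compute them once.
    norms = sqrt(dot(projected, projected)) * sqrt(dot(consensus, consensus))
    observed = dot(projected, consensus) / norms
    # Null distribution: similarity after randomly reordering the profile.
    null = [dot(rng.sample(projected, len(projected)), consensus) / norms
            for _ in range(n_draws)]
    exceed = sum(1 for s in null if s >= observed)
    return observed, (exceed + 1) / (n_draws + 1)

# Hypothetical projected profile and consensus profile over eight basis sets.
projected = [2.0, 1.8, 0.1, -0.1, 0.0, 0.2, -0.2, 0.1]
consensus = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
observed, p_value = empirical_p_value(projected, consensus)
print(observed, p_value)
```

A small p-value indicates that few random reorderings match the consensus as well as the measured profile does.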

44. (Twice Amended) A method for grouping measured response profiles in sets 
which are associated with similar biological effects comprising grouping response profiles
into sets among a plurality of response profiles, each of said sets of response profiles
consisting of response profiles in which the responses of one or more sets of cellular 
constituents in each response profile are similar among response profiles in the set, each 
response profile in said plurality of response profiles (i) comprising measurements of a
plurality of cellular constituents, and (ii) resulting from a different perturbation, wherein each
of said sets of cellular constituents consists of cellular constituents that co-vary under a 
plurality of perturbations or that are co-regulated. 

45. The method of claim 44, wherein the sets of response profiles are identified by 
cluster analysis of the response profiles. 

46. The method of claim 45, wherein the cluster analysis is done by means of a 
clustering algorithm. 

47. (Amended) The method of claim 46, wherein the clustering algorithm is hclust. 

48. The method of claim 45, wherein said cluster analysis determines a clustering tree, 
the sets of response profiles comprising branches of said clustering tree. 

49. The method of claim 45, wherein a statistical significance for the sets of response 
profiles is determined by means of an objective statistical test. 

50. The method of claim 49, wherein the objective statistical test comprises: 

(a) determining an actual fractional improvement in the cluster analysis of the 
response profiles; 

(b) generating permuted response profiles by means of Monte Carlo 
randomization of cellular constituent index for each response profile across 
the measured cellular constituents; 

(c) performing cluster analysis on the permuted response profiles; 

(d) determining the fractional improvement in the cluster analysis of the
permuted response profiles; and

(e) repeating said steps of generating permuted response profiles and performing
cluster analysis on the permuted response profiles so that a distribution of
fractional improvements is obtained;

wherein the statistical significance is determined by comparing the actual fractional
improvement to the distribution of fractional improvements.

58. (Twice Amended) A method for determining the therapeutic efficacy of a drug or
drug candidate comprising identifying one or more groups of sets of cellular constituents in 
one or more response profiles associated with exposure to the drug or drug candidate, each 
response profile comprising measurements of a plurality of cellular constituents, wherein 
each of said groups is indicative of a particular therapeutic effect, and wherein the therapeutic 
effect of the drug or drug candidate is determined to be the particular therapeutic effect 
indicated by the identified groups, wherein each of said sets of cellular constituents consists 
of cellular constituents that co-vary under a plurality of perturbations or that are co-regulated. 

59. The method of claim 58, wherein the sets of cellular constituents are determined 
by a method comprising performing cluster analysis of the response profiles. 

60. The method of claim 59, wherein the cluster analysis is done by means of a 
clustering algorithm. 

61. The method of claim 60, wherein the clustering algorithm is hclust.

62. The method of claim 59, wherein said cluster analysis determines a clustering tree, 
the sets of cellular constituents comprising branches of said clustering tree. 

63. The method of claim 59, wherein a statistical significance for the sets of cellular 
constituents is determined by means of an objective statistical test. 

64. The method of claim 63, wherein the objective statistical test comprises: 

(a) determining an actual fractional improvement in the cluster analysis of cellular 
constituents;

(b) generating permuted response of cellular constituents by means of Monte
Carlo randomization of the perturbation index for each cellular constituent 
across all perturbations; 

(c) performing cluster analysis on the permuted response of cellular constituents;

(d) determining the fractional improvement in the cluster analysis of the permuted 
response of cellular constituents; and 

(e) repeating said steps of generating permuted response of cellular constituents 
and performing cluster analysis on the permuted response of cellular 
constituents so that a distribution of fractional improvements is obtained; 

wherein the statistical significance is determined by comparing the actual fractional 
improvement to the distribution of fractional improvements. 

72. A method for analyzing response data from a biological sample comprising 

(a) grouping cellular constituents from the biological sample into sets of cellular 
constituents that co-vary in a plurality of response profiles, each response 
profile in said plurality of response profiles (i) comprising measurements of a
plurality of cellular constituents, and (ii) resulting from a different
perturbation to said biological sample, and 

(b) grouping the plurality of response profiles into sets of response profiles that 
similarly affect cellular constituents. 

73. The method of claim 72, wherein one or more cellular constituents which co-vary 
in association with a particular biological effect are identified from the sets of cellular
constituents that co-vary in said plurality of response profiles. 

74. The method of claim 72, wherein one or more response profiles that are associated 
with a particular biological effect are identified from the sets of response profiles that 
similarly affect cellular constituents. 

75. The method of claim 73 or 74, wherein the particular biological effect is an effect 
on a biological pathway. 



76. The method of claim 73, wherein the cellular constituents from the biological
sample comprise a plurality of genes or gene transcripts, and one or more genes associated
with said biological effect are identified. 

77. The method of claim 76, wherein the one or more genes identified comprise known
genes.

78. The method of claim 76, wherein the one or more genes identified comprise 
previously unknown genes. 

89. The method of claim 1, wherein said sets of cellular constituents are co-varying
cellular constituent sets. 

90. The method of claim 89, wherein the cellular constituents which co-vary are 
identified by cluster analysis. 

91. The method of claim 89, wherein the cluster analysis is done by means of a 
clustering algorithm. 

92. The method of claim 91, wherein the clustering algorithm is hclust.

93. The method of claim 90, wherein said cluster analysis determines a clustering tree,
the cellular constituents which co-vary comprising branches of said clustering tree.

94. The method of claim 93, wherein the sets of co-varying cellular constituents are 
selected from a branching level of the clustering tree. 

95. The method of claim 90, wherein a statistical significance for the sets of co- 
varying cellular constituents is determined by means of an objective statistical test. 

96. The method of claim 95, wherein the objective statistical test comprises:

(a) determining an actual fractional improvement in cluster analysis of the
cellular constituents;

(b) generating permuted response of cellular constituents by means of Monte
Carlo randomization of the perturbation index for response of each cellular
constituent across the set of perturbations;

(c) performing cluster analysis on the permuted response of cellular constituents;

(d) determining the fractional improvement in the cluster analysis on the permuted 
response of cellular constituents; and 

(e) repeating said steps of generating permuted response of cellular constituents 
and performing cluster analysis on the permuted response of cellular 
constituents so that a distribution of fractional improvements is obtained;

wherein the statistical significance is determined by comparing the actual fractional 
improvement to the distribution of fractional improvements. 

97. The method of claim 39, 40, 41, 42, or 43, wherein said sets of co-varying cellular 
constituents comprise cellular constituents which co-vary in the plurality of response 
profiles. 

98. The method of claim 72, wherein step (a) is carried out before step (b).

99. The method of claim 72, wherein step (b) is carried out before step (a).

100. (Amended) A method of grouping sets of perturbations that similarly affect 
cellular constituents in a cell type or organism among a plurality of perturbations comprising 
grouping response profiles among a plurality of response profiles in sets, each of said sets of 
response profiles consisting of response profiles in which the responses of one or more sets of 
cellular constituents are similar among the response profiles in the set, each response profile 
in said plurality of response profiles (i) comprising measurements of a plurality of cellular
constituents, and (ii) resulting from a different perturbation, wherein each of said sets of 
cellular constituents consists of cellular constituents that co-vary under a plurality of 
perturbations or that are co-regulated, thereby grouping said sets of perturbations. 

101. A method for grouping measured response profiles in sets which are associated
with similar biological effects comprising grouping response profiles in sets among a
plurality of response profiles by cluster analysis of said plurality of response profiles, said
sets of response profiles consisting of response profiles having similar responses of a group
of cellular constituents, each response profile in said plurality of response profiles (i)
comprising measurements of a plurality of cellular constituents, and (ii) resulting from a
different perturbation. 

102. The method of claim 101, wherein the cluster analysis is done by means of a 
clustering algorithm. 

103. The method of claim 102, wherein the clustering algorithm is hclust.

104. The method of claim 101, wherein said cluster analysis determines a clustering 
tree, the sets of response profiles comprising branches of said clustering tree. 

105. The method of claim 101, wherein a statistical significance for the sets of 
response profiles is determined by means of an objective statistical test. 

106. The method of claim 105, wherein the objective statistical test comprises:

(a) determining an actual fractional improvement in the cluster analysis of the
response profiles; 

(b) generating permuted response profiles by means of Monte Carlo 
randomization of cellular constituent index for each response profile across 
the measured cellular constituents; 

(c) performing cluster analysis on the permuted response profiles; 

(d) determining the fractional improvement in the cluster analysis of the 
permuted response profiles; and 

(e) repeating said steps of generating permuted response profiles and performing 
cluster analysis on the permuted response profiles so that a distribution of 
fractional improvements is obtained; 

wherein the statistical significance is determined by comparing the actual fractional 
improvement to the distribution of fractional improvements. 



